Corpus: swa_wikipedia_2021_100K


5.1.18 Words nearly always as next neighbors

Strong next-neighbor (NN) co-occurrences with a low probability of being separated

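The quotient below is calculated as freq(word1)*freq(word2)/NN_freq^2, where NN_freq is the frequency of the two words as next neighbors. Because NN_freq cannot exceed either word frequency, the quotient is at least 1; values close to 1 mean the two words almost never occur apart.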
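The quotient can be reproduced with a few lines of Python. This is a minimal sketch, not the tool that generated this page: the function names `nn_quotient` and `nn_quotients_from_sentences` are hypothetical, and the counting helper assumes that sentences are already tokenized and that next-neighbor pairs are counted within a sentence only.

```python
from collections import Counter


def nn_quotient(freq1, freq2, nn_freq):
    """Return freq(word1) * freq(word2) / NN_freq^2."""
    if nn_freq <= 0:
        raise ValueError("nn_freq must be positive")
    return freq1 * freq2 / nn_freq ** 2


def nn_quotients_from_sentences(sentences):
    """Compute the quotient for every adjacent word pair.

    Assumption: `sentences` is an iterable of token lists, and a pair
    counts as next neighbors only inside one sentence.
    """
    word_freq = Counter()
    pair_freq = Counter()
    for tokens in sentences:
        word_freq.update(tokens)
        # zip(tokens, tokens[1:]) yields each (left, right) neighbor pair
        pair_freq.update(zip(tokens, tokens[1:]))
    return {
        (w1, w2): nn_quotient(word_freq[w1], word_freq[w2], n)
        for (w1, w2), n in pair_freq.items()
    }


# Row "wakazi wapatao" from the table: 3663, 2566, 2541
print(round(nn_quotient(3663, 2566, 2541), 2))  # 1.46
```

Applied to the frequencies in a table row, `nn_quotient` should give the value in the Quotient column after rounding to two decimals.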

Word 1  Word 2        Frequency of word 1  Frequency of word 2  Frequency as NN  Quotient
wakazi  wapatao       3663                 2566                 2541             1.46
sensa   iliyofanyika  3073                 2499                 2486             1.24
Mpaka   uishe         363                  313                  309              1.19
uishe   zinabaki      313                  313                  310              1.02
Dar     es            253                  216                  215              1.18
es      Salaam        216                  207                  198              1.14
hip     hop           106                  107                  100              1.13
Peace   Corps         57                   53                   46               1.43
Bin     Laden         45                   51                   42               1.30
Los     Angeles       46                   41                   41               1.12
Khmer   Rouge         38                   39                   34               1.28
tutuko  zosta         24                   30                   24               1.25
Hong    Kong          25                   26                   24               1.13
Sierra  Leone         24                   25                   21               1.36
Pol     Pot           25                   25                   22               1.29
Cote    d'Ivoire      23                   23                   21               1.20
Buenos  Aires         17                   18                   17               1.06
Boko    Haram         18                   18                   17               1.12
Lakes   Plain”        19                   17                   17               1.12
Death   Row           20                   16                   16               1.25